ABSTRACT


The  Internet  Protocol  (IP)  has  emerged  as  the  dominant  technology  for 
determining  how  data  is  routed  across  the  Internet.  Because  IP  flows  are  defined 
essentially  in  terms  of  origin-destination  (O-D)  pairs,  we  represent  IP  traffic 
engineering  as  a  multi-commodity  flow  problem  in  which  each  O-D  pair  is  treated 
as  a  separate  commodity.  We  account  for  the  diversity  in  IP  routing  by  modeling 
opposite  extremes  of  traffic  engineering:  “naive”  traffic  engineering  where  the  IP 
routes  data  between  any  two  users  using  only  the  shortest  path  between  them, 
and  “best  case”  traffic  engineering  where  IP  has  the  flexibility  to  route  data  using 
multiple  paths  in  the  network  regardless  of  their  length.  We  develop  linear 
programming  formulations  that  identify  the  maximum  data  flow  for  an  IP  network 
that  satisfies  proportionality  constraints  for  traffic  demand  for  each  case  of  traffic 
engineering,  and  we  also  determine  the  optimal  interdiction  of  those  flows  that 
reduces that maximum flow in the worst possible way.

EXECUTIVE SUMMARY


The  objective  of  this  thesis  is  to  provide  a  quantitative  means  to  assess 
the  carrying  capacity  of  an  Internet  Protocol  (IP)  based  network  under  a  general 
model  for  traffic  demands,  as  well  as  identify  the  node  and/or  arc  attacks  that 
interrupt  traffic  flows  in  the  worst  possible  manner. 

Over  the  last  decade  the  Internet  has  become  a  critical  infrastructure  to 
our  way  of  life.  Internet  Service  Providers  are  the  owners  and  operators  of  the 
computer  networks  that  collectively  afford  the  general  public,  schools, 
businesses,  government,  and  military  organizations,  access  to  the  Internet  and 
its  evolving  applications.  Network  operators  have  developed  explicit  and  implicit 
mechanisms  for  influencing  the  way  in  which  IP  traffic  travels  across  their 
networks.  This  process  is  known  as  traffic  engineering. 

We  formulate  a  model  representing  “naive”  traffic  engineering  where  IP 
routes  data  for  each  origin-destination  pair  using  only  a  single  shortest  path  in 
the  network.  We  desire  to  maximize  this  total  amount  of  data  flow  by  raising  flow 
along  every  path  in  a  proportional  manner  until  one  of  the  internal  nodes  and/or 
connecting  arcs  reaches  capacity.  Next  we  formulate  a  model  representing  “best 
case”  traffic  engineering  where  IP  has  the  flexibility  to  route  data  using  multiple 
paths  in  the  network  regardless  of  length.  We  maximize  the  sum  of  the  flows  on 
artificial  return  arcs  by  increasing  flow  along  all  of  them  in  proportion  to  each 
other  until  one  of  the  arcs  in  the  network  reaches  its  capacity. 

ISPs are susceptible to many types of attacks, both physical and cyber, on
their key components. The models developed here identify the locations of attacks
that have the most negative impact on the performance of the ISP.

The  analysis  here  focuses  on  Abilene,  the  high-speed  backbone  of  the 
Internet2  educational  network,  a  not-for-profit  advanced  networking  consortium  of 
universities, laboratories, and government agencies. We perform our analysis on
a network representation of Abilene with node and/or arc capacities. We compute
the total amount of traffic routed between customers, the overall flow through the
network,  and  the  utilization  of  Abilene’s  transshipment  routers  using  both  the 
naive  and  the  best  case  traffic  engineering  formulations.  We  also  identify  the 
optimal  node  and  arc  attacks  that  affect  the  total  amount  of  traffic  routed  between 
customers and the flow through the network in the worst possible way. We find
that Abilene is well-provisioned in the sense that it tends to be the arcs, in
particular the customer access links, that saturate data flow in the network, a
generalization that is consistent with our results.

The  models  and  analysis  in  this  thesis  are  applicable  to  any  ISP  network. 
The  general  public,  businesses,  civilian  and  military  organizations  rely  heavily  on 
these networks. As the reliance grows, so will the need for understanding an
ISP’s limitations and vulnerability to attacks.

I. INTRODUCTION


A.  BACKGROUND 

Over  the  last  decade  the  Internet  has  become  an  infrastructure  that  is 
critical  to  supporting  our  way  of  life.  People  throughout  the  world  rely  on  the 
Internet  as  a  means  for  personal  communication  through  the  use  of  email,  instant 
messaging,  or  chat  rooms.  Students  have  access  to  limitless  amounts  of 
information  stored  on  the  Internet  on  any  topic  imaginable.  Co-workers  are  able 
to  share  information  and  conduct  business  in  unprecedented  manners.  The  Navy 
Marine  Corps  Intranet  and  the  Army’s  LandWarNet  provide  service  members  on 
all  command  levels  with  secure  platforms  for  information  sharing  amongst 
military  installations  and  forward-deployed  forces  throughout  the  world. 

Internet  Service  Providers  (ISPs)  are  the  owners  and  operators  of  the 
computer  networks  that  collectively  provide  the  general  public,  schools, 
businesses,  government,  and  military  organizations,  access  to  the  Internet  and 
its  evolving  applications. 

The  operation  of  the  Internet  is  determined  by  protocols  which  specify  the 
roles,  rules,  and  responsibilities  for  individual  technologies.  Among  these,  the 
Internet  Protocol  (IP)  has  emerged  as  the  dominant  technology  for  determining 
how  an  ISP  routes  traffic  across  its  part  of  the  Internet  from  one  customer  to 
another. 

Network  operators  have  developed  explicit  and  implicit  mechanisms  for 
influencing  the  way  in  which  IP  traffic  travels  across  their  networks.  This  process, 
known  as  “traffic  engineering,”  allows  the  network  operator  to  tune  the 
performance  of  their  network  in  response  to  changing  traffic  levels  or 
environmental conditions. The two main protocols used for traffic engineering
of IP within a single ISP are Open Shortest Path First (OSPF) and Intermediate
System-Intermediate System (IS-IS), both of which compute shortest paths
based on configurable link weights (see Rexford, 2006 and references therein).

B.  RESEARCH  OBJECTIVES  AND  MODELING  APPROACH 

The  objective  of  this  thesis  is  to  provide  a  quantitative  means  to  assess 
the  carrying  capacity  of  an  IP-based  network  under  general  traffic  demands,  and 
then  to  identify  the  node  and/or  arc  attacks  that  interrupt  traffic  flows  in  the  worst 
possible  manner.  Such  tools  will  lead  to  a  better  understanding  of  the  system- 
wide  vulnerabilities  of  real  IP  networks,  as  well  as  provide  guidelines  for  network 
protection.  We  measure  the  performance  of  a  given  network  in  terms  of  the 
maximum  traffic  levels  that  it  can  support.  We  identify  network  vulnerabilities  by 
determining  the  attack(s)  to  network  components  that  reduce  its  maximum 
carrying  capacity  in  the  worst  possible  way. 

We  represent  IP  traffic  flow  using  a  “gravity  model”  for  traffic  demand, 
which  states  that  the  amount  of  traffic  exchanged  between  two  users  is 
proportional  to  the  total  amount  of  traffic  entering  and  exiting  each  of  those  users 
(Alderson et al., 2006). Thus, the gravity model assumes that demand for traffic
is proportional to the product of the “size” of the two users. In practice, the actual
traffic levels (i.e., data flow between users) need not be proportional, even when
the  demands  follow  the  gravity  model.  However,  we  assume  that  traffic  levels 
occur  in  proportion  to  demand,  which  is  an  extreme  type  of  “fairness”  that  we 
impose.  The  idea  is  to  provide  a  share  of  network  resources  (e.g.,  transshipment 
router  bandwidth  throughput  capacity)  to  each  user  based  on  their  size. 
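
As a minimal sketch of the gravity model with proportional traffic levels, the following Python fragment computes relative demands and the resulting traffic for a given proportionality constant. The function names and the user sizes are hypothetical and chosen only for illustration; they do not come from the thesis.

```python
from itertools import permutations


def gravity_demand(size):
    """Relative demand between every ordered pair of distinct users.

    size: dict mapping user -> "size" (total traffic entering/exiting it).
    """
    return {(s, t): size[s] * size[t] for s, t in permutations(size, 2)}


def proportional_traffic(size, p):
    """Traffic levels that occur in proportion to demand: p * size[s] * size[t]."""
    return {pair: p * d for pair, d in gravity_demand(size).items()}


# Hypothetical user sizes (assumption, for illustration only).
sizes = {"a": 10.0, "b": 5.0, "c": 1.0}
demand = gravity_demand(sizes)
traffic = proportional_traffic(sizes, p=0.01)
```

Because every traffic level shares the same constant, the ratio between any two traffic levels equals the ratio between the corresponding demands, which is the "fairness" assumption described above.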

IP traffic engineering varies from ISP to ISP and depends on the
technologies and policies in use. For example, it may be the intent of the ISP to
minimize  end-to-end  traffic  delay,  or  maximize  utilization  of  network  resources,  or 
maximize  “customer  satisfaction.”  Some  ISP  users  may  receive  preferential 
access  to  network  resources,  with  the  other  users  sharing  what  remains.  So  we 
model  two  opposite  extremes  of  traffic  engineering  alternatives.  We  first 
formulate a model representing “naive” traffic engineering where IP routes data
for each origin-destination pair using only a single shortest path in the network.
This  policy  is  easy  to  implement  but  tends  to  underutilize  network  resources.  The 
second  formulation  represents  “best  case”  traffic  engineering  where  IP  has  the 
flexibility  to  route  data  using  multiple  paths  in  the  network  regardless  of  length. 
This  policy  yields  a  higher  utilization  of  resources  but  is  more  complicated  to 
implement  and  manage,  and  is  an  upper  bound  on  achievable  performance. 

We  represent  a  particular  ISP  as  a  network  by  considering  its  router-level 
map.  Nodes  in  the  network  correspond  to  routing  devices,  and  arcs  between 
routers  correspond  to  direct  connectivity  as  seen  by  IP.  For  simplicity,  we 
assume  that  connections  between  nodes  correspond  to  physical  connectivity, 
although  this  may  not  be  the  case.  We  also  consider  the  network  capacities  in 
the  form  of  connection  speeds  for  arcs,  and  router  throughput  bandwidth 
capacities  (Alderson  et  al.,  2005). 

We  develop  linear  programming  (LP)  models  that  allow  us  to  analyze  the 
maximum  carrying  capacity  of  an  ISP  under  a  gravity  model  of  user  traffic 
demand. The models also examine the utilization of the ISP’s components (i.e.,
routers and their arcs), as well as identify the bandwidth limitations on those
components.  ISPs  are  susceptible  to  many  types  of  attacks,  both  physical  and 
cyber,  to  their  key  components  (Doyle  et  al.,  2005).  The  models  developed  here 
identify  the  attacks  that  have  the  biggest  impact  on  the  performance  of  the  ISP. 

C.  LITERATURE  REVIEW  OF  PREVIOUS  WORK 

The  study  of  network  vulnerability  problems  is  not  new.  For 
telecommunications,  considerable  effort  has  been  directed  at  the  analysis  of  the 
physical  infrastructure,  in  particular  the  design  of  fiber  optic  networks 
(Henningsson et al., 2006). Grötschel et al. (1995) present a general framework
for the design of “survivable” communication networks, including the study of
minimum spanning trees, Steiner trees, and minimum cost k-connected network
design problems. An updated treatment of the problem can be found in Kerivin
and Mahjoub (2005).

Much of the work in network vulnerability and survivability has its roots in
graph theory, in which the network is represented solely in terms of its
connections (without any annotations or domain-specific data), and considerable
effort is devoted to assessing various measures of global connectivity. These
include network diameter (i.e., average length of shortest path between any two
nodes), characteristic path length (i.e., the average distance along any path
between any two nodes), or the size and distribution of connected clusters.
Recently,  these  graph  theoretic  measures  have  been  applied  to  the  Internet,  and 
many  studies  have  focused  on  how  these  connectivity  patterns  change  in  the 
presence of accidental or intentional graph losses (Albert et al., 2000; Cohen et
al., 2000; Cohen et al., 2001; Bollobás and Riordan, 2003; Crucitti et al., 2004).
As  discussed  in  Alderson  (2008),  a  general  problem  with  this  approach  is  that 
any  notion  of  network  “function”  is  being  approximated  (often  poorly)  by  these 
simple  graph  theoretic  measures. 

The vulnerability of router-level Internet networks was discussed by Doyle
et al. (2005), who showed that previous results by Albert et al. (2000), which
focused on connectivity patterns and on critical high-degree hubs, were
not relevant to the real Internet. In contrast, they considered the need to
maximize  flow  on  the  part  of  the  ISP  and  formulated  a  simple  path-based  model 
of  network  throughput,  described  here  as  the  “single-path”  model.  However,  their 
consideration  of  “worst  case”  attacks  on  network  routers  was  myopic  and 
heuristic,  in  that  it  simply  ranked  nodes  in  a  prioritized  list  in  terms  of  the  effect 
their  removal  would  have  on  overall  network  throughput.  They  did  not  consider 
attacks  that  were  formally  optimal,  nor  did  they  consider  more  sophisticated 
models  of  traffic  engineering  that  underlie  real  IP  networks. 

Recent effort has been devoted to the application of optimal network
interdiction to critical infrastructure protection (Brown et al., 2006). This thesis
continues  that  effort  and  formalizes  the  notion  of  an  optimal  attack  for  a 
maximum  proportional  flow  problem  and  provides  analysis  and  computational 
implementation to solve it efficiently.

D. STRUCTURE OF THESIS AND CHAPTER OUTLINE

The remainder of this thesis is organized as follows: In Chapter II we
formulate  two  LP  models,  representing  alternative  approaches  to  traffic 
engineering  discussed  above.  In  Chapter  III  we  use  these  models  to  perform  a 
detailed  analysis  of  Abilene,  the  backbone  for  the  Internet2  academic  network. 
Finally, in Chapter IV we summarize the contributions of the thesis and offer
suggestions for future research.

II. MODEL FORMULATION


Because IP flows are defined essentially in terms of Source (node s) and
Terminal (node t) node pairs, we represent IP traffic engineering as a
multi-commodity flow problem in which each s-t pair is treated as a separate
commodity. In our network, the “edge” nodes (i.e., the nodes that provide network
access to the users and connect to the internal nodes) represent the users. We
will  assume  that  all  users  communicate  with  one  another,  and  that  the  demand 
for  flow  between  user  pairs  is  proportional  to  the  product  of  their  capacities,  an 
assumption  consistent  with  the  aforementioned  gravity  model  of  traffic  demand. 
The  “internal”  nodes  (i.e.,  nodes  that  provide  connectivity  to  the  other  network 
devices)  represent  intra-network  routing  devices  (i.e.,  “routers”),  and  arcs 
connecting  them  represent  one-hop  IP  connectivity  between  routers  (i.e.,  routers 
directly  “see”  one  another  according  to  IP). 

The  primary  problem  is  to  identify  the  maximum  flow  (and  corresponding 
optimal  routing)  for  a  multi-commodity  network  that  satisfies  the  proportionality 
constraints  for  flow  demand  as  well  as  capacity  constraints  on  nodes  and  arcs. 

The  secondary  problem  is  to  identify  the  optimal  interdiction  of  those  flows 
that  reduces  that  maximum  flow  in  the  worst  possible  way. 

We  consider  two  approaches  to  traffic  engineering.  First,  we  consider  a 
strict  approach  where  each  commodity  follows  a  single  shortest  path.  Second, 
we  will  look  at  best  case  traffic  engineering  in  which  it  is  possible  to  route  traffic 
through  the  network  by  taking  multiple,  possibly  longer,  paths. 

A.  SINGLE  PATH  MULTI-COMMODITY  MAXIMUM  FLOW 

1.  Solving  for  Maximum  Flow 

This model represents the simplest form of traffic engineering. Traffic from
user s to user t follows a single path in the network. That path is the computed
shortest path, in terms of the number of internal nodes visited (or arcs traversed),
from user s to user t. The total amount of data flow through the network is the
sum of the traffic routed along all of the shortest s-t paths. There exist flow
throughput capacities on the network’s “internal” nodes and/or the arcs
connecting them. We desire to maximize this total amount of data flow by raising
flow along every path in a proportional manner until one of the internal nodes
and/or connecting arcs reaches capacity. We refer to the network components
that reach capacity as "bottlenecks."

Formulation  1:  MAX  SP  (Maximizing  Single-Path  Flow) 

Index Use

$i, j, k \in N$          Nodes

$(i, j) \in A$           Directed arc from node $i$ to node $j$

$s, t \in E \subset N$   Source and terminal nodes in the set of “edge” nodes $E$

Data

$D_s$                    Traffic demand by edge node $s \in E$ [flow]

$B_k$                    Throughput capacity of node $k \in N$ [flow]

$u_{i,j}$                Upper bound on flow from node $i$ to node $j$ for each arc $(i, j) \in A$ [flow]

$r^{s,t}$                Shortest path route from node $s$ to node $t$ for each $s, t \in E$ [flow]

Calculated Data

$r_k^{s,t}$              Binary indicator whether node $k$ is on the shortest path from node $s$ to node $t$ for $s, t \in E$ [binary]

\[ r_k^{s,t} = \begin{cases} 1 & \text{if node } k \text{ is on } r^{s,t} \\ 0 & \text{if node } k \text{ is not on } r^{s,t} \end{cases} \]

$q_{i,j}^{s,t}$          Binary indicator whether arc $(i, j)$ is on the shortest path from node $s$ to node $t$ for $s, t \in E$ [binary]

\[ q_{i,j}^{s,t} = \begin{cases} 1 & \text{if arc } (i,j) \text{ is on } r^{s,t} \\ 0 & \text{if arc } (i,j) \text{ is not on } r^{s,t} \end{cases} \]

Decision Variable

$X_{s,t}$                Flow along route $r^{s,t}$ from node $s$ to node $t$ [flow]

\[ X_{s,t} = p B_s B_t \quad \text{where } p \text{ is a constant of proportionality} \]

Formulation

\[ \max_{p} \; \sum_{s,t \in E} X_{s,t} \tag{C1.0} \]

s.t.

\[ \sum_{s,t \in E} X_{s,t}\, r_k^{s,t} \le B_k \quad \forall k \in N \tag{C1.1} \]

\[ \sum_{s,t \in E} X_{s,t}\, q_{i,j}^{s,t} \le u_{i,j} \quad \forall (i,j) \in A \tag{C1.2} \]

\[ X_{s,t} = p B_s B_t \quad \forall (s,t) \in E \times E \tag{C1.3} \]

(NOTE:  Throughout  the  thesis,  n  denotes  the  number  of  nodes,  e  denotes  the 
number  of  edge  nodes,  and  m  denotes  the  number  of  arcs.) 

Discussion 

Equation (C1.0) is the objective function, which represents the total amount of
data flow through the network. It is the sum of the traffic routed along all of the
shortest s-t paths. We maximize the objective function value by increasing the
proportionality constant p. Equations (C1.1) and (C1.2) limit the amount of flow
through each node and arc respectively. Equation (C1.3) ensures that flow is
routed between each s-t pair, and that those flows are raised in proportion to
each other.

There is considerable preprocessing involved in solving for single-path
maximum flow. We compute the $r^{s,t}$ values using the Floyd-Warshall algorithm
(Appendix A). Floyd-Warshall determines the shortest paths between all node
pairs, but we are only interested in the shortest paths between each s-t node pair.

Once the shortest path routes are determined, we use them to build a matrix
(size: $e^2 \times n$) of $r_k^{s,t}$ values, with one row per pair $(s,t)$ and one
column per node $k \in N$ (Appendix B), and a matrix (size: $e^2 \times m$) of
$q_{i,j}^{s,t}$ values, with one row per pair $(s,t)$ and one column per arc
$(i,j) \in A$ (Appendix C).

Each of the three tasks mentioned above runs in $O(n^3)$.
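
The preprocessing described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code of Appendices A through C: path length is measured in hops, ties between equally short paths are broken by node order, the indicators are stored as sets rather than 0/1 matrices, and the endpoints s and t are counted as being on the route. All function names are hypothetical.

```python
def floyd_warshall_hops(nodes, arcs):
    """All-pairs shortest paths by hop count. Returns (dist, nxt)."""
    inf = float("inf")
    dist = {i: {j: (0 if i == j else inf) for j in nodes} for i in nodes}
    nxt = {i: {j: None for j in nodes} for i in nodes}
    for i, j in arcs:
        dist[i][j] = 1
        nxt[i][j] = j
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    nxt[i][j] = nxt[i][k]
    return dist, nxt


def route(nxt, s, t):
    """Node sequence of the stored shortest s-t path ([] if t is unreachable)."""
    if s != t and nxt[s][t] is None:
        return []
    path = [s]
    while path[-1] != t:
        path.append(nxt[path[-1]][t])
    return path


def indicators(nodes, arcs, edge_nodes):
    """r[(s,t)]: nodes on the s-t route; q[(s,t)]: arcs on the s-t route."""
    _, nxt = floyd_warshall_hops(nodes, arcs)
    r, q = {}, {}
    for s in edge_nodes:
        for t in edge_nodes:
            if s != t:
                path = route(nxt, s, t)
                r[(s, t)] = set(path)
                q[(s, t)] = set(zip(path, path[1:]))
    return r, q


# Hypothetical five-node network: edge nodes a, b, c; internal routers x, y.
nodes = ["a", "b", "c", "x", "y"]
links = [("a", "x"), ("b", "x"), ("x", "y"), ("c", "y")]
arcs = links + [(j, i) for i, j in links]  # each link is a pair of directed arcs
r, q = indicators(nodes, arcs, ["a", "b", "c"])
```

Membership tests such as `"x" in r[("a", "c")]` then play the role of the binary entries in the two matrices.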

Our formulation allows us to study networks in which only the nodes are
capacitated ($B_k < \infty$, $u_{i,j} = \infty$), or when just the arcs are
capacitated ($B_k = \infty$, $u_{i,j} < \infty$), or when both nodes and arcs are
capacitated ($B_k < \infty$, $u_{i,j} < \infty$).

The  special  structure  associated  with  the  constant  of  proportionality  p 
affords  a  direct  analytic  solution  to  the  maximum  flow  under  single-path  routing. 
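For a network with capacitated nodes and un-capacitated arcs, consider
equations (C1.1) and (C1.3).

\[ \sum_{s,t \in E} X_{s,t}\, r_k^{s,t} \le B_k \quad \forall k \in N \tag{C1.1} \]

\[ X_{s,t} = p B_s B_t \quad \forall (s,t) \in E \times E \tag{C1.3} \]

Equation (C1.1) can be rewritten as

\[ \sum_{s,t \in E} p B_s B_t\, r_k^{s,t} \le B_k \quad \forall k \in N \]

or

\[ p \le \frac{B_k}{\sum_{s,t \in E} r_k^{s,t} B_s B_t} \quad \forall k \in N \]

Now we can solve for p directly.

\[ p = \min_{k \in N} \left\{ \frac{B_k}{\sum_{s,t \in E} r_k^{s,t} B_s B_t} \;:\; \sum_{s,t \in E} r_k^{s,t} B_s B_t > 0 \right\} \tag{1} \]

For a network with capacitated arcs and un-capacitated nodes, we solve for p
using equations (C1.2) and (C1.3) and performing the same substitution. Solving
for p in this type of network yields the following result

\[ p = \min_{(i,j) \in A} \left\{ \frac{u_{i,j}}{\sum_{s,t \in E} q_{i,j}^{s,t} B_s B_t} \;:\; \sum_{s,t \in E} q_{i,j}^{s,t} B_s B_t > 0 \right\} \tag{2} \]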


For  networks  where  both  the  nodes  and  arcs  have  capacity,  the  correct  p  is  the 
minimum  p  between  equations  (1)  and  (2).  This  type  of  solution  is  easily 
implemented  in  a  spreadsheet  program  such  as  EXCEL. 
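
The closed-form solution of equations (1) and (2) can also be sketched in a few lines of Python. The data layout is an assumption made for illustration: `r` and `q` map each s-t pair to the set of nodes and arcs on its route, `size` holds the edge-node values that appear as B_s and B_t in (C1.3), and only capacitated components are listed in `B` and `u`. The network itself is hypothetical.

```python
def max_proportionality(B, u, r, q, size):
    """Largest p for which X[s,t] = p * size[s] * size[t] respects all capacities.

    B: node -> capacity (capacitated nodes only)
    u: arc  -> capacity (capacitated arcs only)
    r: (s, t) -> set of nodes on the s-t route
    q: (s, t) -> set of arcs on the s-t route
    """
    best = float("inf")
    for k, cap in B.items():           # equation (1): node bottlenecks
        load = sum(size[s] * size[t] for (s, t), on in r.items() if k in on)
        if load > 0:
            best = min(best, cap / load)
    for arc, cap in u.items():         # equation (2): arc bottlenecks
        load = sum(size[s] * size[t] for (s, t), on in q.items() if arc in on)
        if load > 0:
            best = min(best, cap / load)
    return best


# Hypothetical example: two users a and b joined through a single router x.
size = {"a": 2.0, "b": 3.0}
r = {("a", "b"): {"a", "x", "b"}, ("b", "a"): {"b", "x", "a"}}
q = {("a", "b"): {("a", "x"), ("x", "b")}, ("b", "a"): {("b", "x"), ("x", "a")}}

p_nodes = max_proportionality({"x": 24.0}, {}, r, q, size)
p_both = max_proportionality({"x": 24.0}, {("a", "x"): 3.0}, r, q, size)
total_flow = p_both * sum(size[s] * size[t] for (s, t) in r)
```

With only the router capacitated, the router is the bottleneck; once the access arc (a, x) is given a small capacity, that arc determines p instead.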


2.  Minimizing  the  Maximum  Flow 

Suppose an opponent (an attacker) wants to incur the greatest amount of
“damage” on the network. Assume that the attacker has the capability to destroy
a limited number of nodes and/or arcs, thus reducing to zero the capacity for
each of the destroyed nodes and/or arcs ($B_k = 0$ and/or $u_{i,j} = 0$). The
attacker must decide which nodes and/or arcs in the network to destroy so that
the maximum flow is minimized, perhaps to zero.

The  previous  formulation  is  the  same,  with  the  addition  of  the  following 
data  and  decision  variables. 

Formulation 2: MIN-MAX SP (Minimizing the Maximum Single-Path Flow)

Data

attacks                  Number of nodes and/or arcs that the attacker can destroy [cardinality]

Decision Variable

$Y_k$                    Binary indicator for attacker destruction of node $k \in N$ [binary]

\[ Y_k = \begin{cases} 1 & \text{if node } k \text{ is destroyed} \\ 0 & \text{otherwise} \end{cases} \]

$Y_{i,j}$                Binary indicator for attacker destruction of arc $(i, j) \in A$ [binary]

\[ Y_{i,j} = \begin{cases} 1 & \text{if arc } (i,j) \text{ is destroyed} \\ 0 & \text{otherwise} \end{cases} \]

Min-Max optimization of flow

\[ \min_{Y \in \Upsilon} \; \max_{p} \; \sum_{s,t \in E} X_{s,t} \tag{C2.0} \]

s.t.

\[ \sum_{s,t \in E} X_{s,t}\, r_k^{s,t} \le B_k (1 - Y_k) \quad \forall k \in N \tag{C2.1} \]

\[ \sum_{s,t \in E} X_{s,t}\, q_{i,j}^{s,t} \le u_{i,j} (1 - Y_{i,j}) \quad \forall (i,j) \in A \tag{C2.2} \]

\[ X_{s,t} = p\, B_s (1 - Y_s)\, B_t (1 - Y_t) \quad \forall (s,t) \in E \times E \tag{C2.3} \]

where $Y \in \Upsilon$ is the set of attack plans satisfying

\[ \sum_{k \in N} Y_k + \sum_{(i,j) \in A} Y_{i,j} \le \text{attacks} \tag{C2.4} \]

\[ Y_{i,j} = Y_{j,i} \tag{C2.5} \]

\[ Y_k,\, Y_{i,j} \in \{0, 1\} \quad \forall i, j, k \]

Discussion

Equation (C2.0), the objective function, reflects that the attacker desires to
minimize the previously maximized sum of traffic routed along all of the shortest
s-t paths in the network. The attacker will seek to destroy nodes, or arcs, or both
depending on the network’s structure (i.e., which components are capacitated).
Equations (C2.1) and (C2.2) limit the amount of flow through each node and arc
respectively. Equation (C2.3) ensures that flow is routed between each s-t pair,
and that those flows are raised in proportion to each other. Equation (C2.4)
places a limit on the number of attacks that the attacker can prosecute.
Destroying a node or arc drops its capacity to zero. Equation (C2.5) states that
destroying arc (i, j) also destroys arc (j, i).

Solving  by  Total  Enumeration 

There are a finite number of $Y_k$ and $Y_{i,j}$ variables for the model. Thus, the
optimal “interdiction” solution can be determined by checking all possible choices
for $Y_k$ and $Y_{i,j}$ for a given value of attacks, and then keeping the “best” solution
(i.e., the solution that minimizes the maximum flow through the network the most,
as in Rardin et al., 1998).

This  type  of  total  enumeration  works  for  problems  with  limited  size.  For  the 
network  operator  (the  defender),  there  are  e(e-1)  decision  variables,  one  for  each 
s-t  pair.  There  are  n+m  decision  variables  for  the  attacker. 
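
Total enumeration can be sketched in Python as follows. Several simplifications are assumptions of this sketch rather than features of the thesis: only internal nodes are attacked, only nodes are capacitated, breadth-first search replaces Floyd-Warshall for finding fewest-hop paths (ties broken by adjacency order), and an attack that disconnects any s-t pair is treated as forcing p to zero, since proportionality cannot be maintained. Function names and the example network are hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs_path(adj, s, t, removed):
    """Fewest-hop s-t path that avoids the removed nodes, or None."""
    parent = {s: None}
    queue = deque([s])
    while queue:
        i = queue.popleft()
        if i == t:
            path = []
            while i is not None:
                path.append(i)
                i = parent[i]
            return path[::-1]
        for j in adj[i]:
            if j not in parent and j not in removed:
                parent[j] = i
                queue.append(j)
    return None


def max_flow_single_path(adj, B, size, removed=frozenset()):
    """Total single-path flow p * sum(size[s] * size[t]) under node capacities B."""
    load = {k: 0.0 for k in B if k not in removed}
    total = 0.0
    for s in size:
        for t in size:
            if s == t:
                continue
            path = bfs_path(adj, s, t, removed)
            if path is None:
                return 0.0  # assumption: a cut-off pair forces p = 0
            weight = size[s] * size[t]
            total += weight
            for k in path:
                if k in load:
                    load[k] += weight
    # p is unbounded (inf) if no capacitated node carries any flow.
    p = min((B[k] / l for k, l in load.items() if l > 0), default=float("inf"))
    return p * total


def worst_attack(adj, B, size, attacks):
    """Check every set of `attacks` internal nodes; keep the most damaging one."""
    return min((max_flow_single_path(adj, B, size, frozenset(c)), c)
               for c in combinations(sorted(B), attacks))


# Hypothetical network: users a and b joined by two parallel routers x and y.
adj = {"a": ["x", "y"], "b": ["x", "y"], "x": ["a", "b"], "y": ["a", "b"]}
B = {"x": 10.0, "y": 4.0}
size = {"a": 1.0, "b": 1.0}
baseline = max_flow_single_path(adj, B, size)
flow_after, attacked = worst_attack(adj, B, size, attacks=1)
```

The number of candidate attack sets grows combinatorially with the number of attackable components, which is why this approach is practical only for problems of limited size.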

Discussion 

When a node or arc is attacked, it is removed from the network. We then
use the Floyd-Warshall algorithm to re-compute the shortest paths for the
remaining s-t pairs so that the new $r_k^{s,t}$ and $q_{i,j}^{s,t}$ values can be calculated.

Changing the $r_k^{s,t}$ and $q_{i,j}^{s,t}$ values directly impacts flow through the
network, as measured by p. In some instances p will decrease, as expected.
However, in other instances p may actually increase. This is the converse of
Braess’s paradox, which states that adding additional capacity to a network can
reduce the network’s total flow (see Florian and Hearn, 1995). In our case it is
possible that, by attacking certain nodes or arcs, bottlenecks are removed from
the network, resulting in a net increase in flow through the remaining network.

B.  MULTIPLE  PATH  MULTI-COMMODITY  MAXIMUM  FLOW  MODEL 

1.  Solving  for  the  Maximum  Flow 

This  model  represents  best-case  traffic  engineering  in  that  the  network  is 
able  to  route  traffic  along  multiple,  possibly  longer,  paths,  and  makes  better 
overall  use  of  network  resources.  Here,  we  modify  the  standard  LP  formulation  of 
the  Maximum  s-t  Flow  problem  (Appendix  D)  to  accommodate  multi-commodity 
flows  while  also  adding  a  proportionality  constraint  for  each  s-t  pair.  We  use  the 
technique  of  “node  splitting”  to  replace  the  capacity  of  a  node  with  a  capacitated 
arc  connecting  the  two  split  nodes.  In  this  manner,  all  capacities  are  represented 
as arc capacities. Like the single-path model, the goal is to maximize the total
amount of data flow through the network. We introduce an artificial return arc for
every s-t pair. The return arcs are unbounded ($u_{s,t} = \infty$), but must adhere to the
constant of proportionality. We maximize the sum of the flows on the return arcs,
again, by increasing flow along all of them in proportion to each other until one of
the arcs in the network reaches its capacity.
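
Node splitting can be illustrated with a short Python sketch. The naming convention (appending a prime to form the outbound copy of a node) and the function name are assumptions made for this example only.

```python
def split_nodes(arcs, B):
    """Replace node capacities with arc capacities via node splitting.

    arcs: dict (i, j) -> arc capacity in the original network
    B:    dict k -> throughput capacity of each node to be split
    Each node k in B becomes k (inbound side) and k' (outbound side),
    joined by a new arc (k, k') whose capacity is B[k].
    """
    def out(k):
        # Flow leaving a split node departs from its outbound copy.
        return f"{k}'" if k in B else k

    split = {(k, out(k)): cap for k, cap in B.items()}
    for (i, j), cap in arcs.items():
        split[(out(i), j)] = cap
    return split


# Hypothetical example: a -> x -> b, where router x can forward 3 units.
original = {("a", "x"): 5.0, ("x", "b"): 5.0}
capacities = split_nodes(original, {"x": 3.0})
```

After splitting, any flow that passes through x must cross the arc (x, x'), so an ordinary arc-capacity constraint enforces the router's throughput limit.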

Formulation  3:  MAX  MP  (Maximizing  Multiple-Path  Flow) 

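Index Use

$i, j, k \in N$          Nodes

$(i, j) \in A$           Directed arc from node $i$ to node $j$

$s, t \in E \subset N$   Source and terminal nodes in the set of “edge” nodes $E$

Preprocessing

Each “internal” node $k \notin E$ is split into two nodes $\{k, k'\}$ with
directed arc $(k, k')$ connecting them.

Data

$B_k$                    Throughput capacity at node $k \in N$ [flow]

$D_s$                    Demand for edge node $s \in E$ [flow]

$u_{i,j}$                Upper bound on flow from node $i$ to node $j$ on arc $(i, j) \in A$ [flow].
                         The arc $(k, k')$ created by splitting node $k$ ($i = k$, $j = k'$)
                         carries the node capacity $B_k$; every other arc keeps its own
                         upper bound.

Decision Variables

$X_{i,j}^{s,t}$          “Internal” flow of commodity s-t on arc $(i, j) \in A$ [flow]

$Z_{s,t}$                “Return” flow of commodity s-t on artificial arc $(t, s) \in A$ [flow]

\[ Z_{s,t} = p D_s D_t \quad \text{where } p \text{ is a constant of proportionality} \]

Formulation [dual variables]

\[ \max_{p} \sum_{s,t \in E} Z_{s,t} \;=\; \max_{p} \; p \sum_{s,t \in E} D_s D_t \tag{C3.0} \]

s.t.

\[ \sum_{(k,j) \in A} X_{k,j}^{s,t} - \sum_{(i,k) \in A} X_{i,k}^{s,t} =
   \begin{cases} Z_{s,t} & \text{if } k = s \\ 0 & \text{if } k \ne s, t \\ -Z_{s,t} & \text{if } k = t \end{cases}
   \quad \forall k \in N,\ \forall (s,t) \in E \times E \quad [\alpha_k^{s,t}] \tag{C3.1} \]

\[ \sum_{s,t \in E} X_{i,j}^{s,t} \le u_{i,j} \quad \forall (i,j) \in A \quad [\beta_{i,j}] \tag{C3.2} \]

\[ Z_{s,t} - p D_s D_t = 0 \quad \forall (s,t) \in E \times E \quad [\mu^{s,t}] \tag{C3.4} \]

\[ X_{i,j}^{s,t} \ge 0, \quad Z_{s,t} \ge 0, \quad p \ \text{U.R.S.} \]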


Discussion 


Equation (C3.0), the objective function, represents the sum of the flows
along the return arcs (t, s). We maximize the objective function value by
increasing the proportionality constant p. Equation (C3.1) is a balance of flow
constraint. Equation (C3.2) limits the amount of flow on each arc. Equation (C3.4)
ensures that flow is routed along each return arc (t, s), and that those flows are
raised in proportion to each other.

The following table shows the number of decision variables and
constraints contained in the multiple-path model:

Decision Variables                    Constraints
Flow from s to t   e·(e-1)            Flow Balance   n·e·(e-1)
Arc Flow           m·e·(e-1)          Arc Capacity   m
                                      Demand         e·(e-1)


The multiple-path model is a linear programming formulation that we solve
using General Algebraic Modeling System (GAMS) software and the CPLEX
solver. The effort GAMS requires to solve the multiple-path model grows
significantly with the number of s-t pairs, e(e-1), in a given network.
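
To give a feel for this growth, the following Python sketch evaluates the variable and constraint counts of the multiple-path model as functions of n, m, and e. The function name and the sample network sizes are hypothetical and are not taken from the Abilene analysis.

```python
def multipath_model_size(n, m, e):
    """Counts of decision variables and constraints in the multiple-path LP.

    n: number of nodes, m: number of arcs, e: number of edge nodes.
    """
    pairs = e * (e - 1)  # one commodity per ordered s-t pair
    return {
        "variables": {"return_flow": pairs, "arc_flow": m * pairs},
        "constraints": {
            "flow_balance": n * pairs,
            "arc_capacity": m,
            "demand": pairs,
        },
    }


# Hypothetical sizes: doubling the edge nodes roughly quadruples the model.
small = multipath_model_size(n=6, m=14, e=3)
large = multipath_model_size(n=6, m=14, e=6)
```

Because the number of commodities is quadratic in e, both the arc-flow variables and the flow-balance constraints grow quadratically with the number of edge nodes.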

2. Minimizing the Maximum Flow


Consider again the case of an attacker who can disable a finite number of
network components and seeks to damage the total network flow in the worst
possible manner. A natural choice to represent the effect of an arc attack is to set
the capacity of the attacked arc to zero (as was done in the single-path model).
However, an equivalent and computationally attractive approach is to assign a
penalty cost, $v_{i,j}$, to attacked arcs. This discourages the defender from sending
flow across an arc that has been destroyed. For the defender to avoid attacked
arcs, the penalty cost must be greater than one: if $v_{i,j} = 1$, the defender is
completely indifferent to sending flow across the interdicted arc, and the
resulting problem may have many equivalent optimal solutions. Thus we set
$v_{i,j} = 2$ if arc (i, j) is susceptible to being attacked. We can similarly
designate an arc as invulnerable by setting $v_{i,j} = 0$. In this model, artificial
return arcs are all invulnerable.

The  previous  formulation  is  the  same,  with  the  addition  of  the  following 
data  and  decision  variables. 

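Formulation 4: MIN-MAX MP (Minimizing the Maximum Multiple-Path Flow)

Data

$v_{i,j}$                Penalty cost for arc $(i, j) \in A$ [cost/flow]

attacks                  Number of arcs the attacker can destroy [cardinality]

Decision Variables

$Y_{i,j}$                Attacker destruction of arc $(i, j) \in A$ [binary]

\[ Y_{i,j} = \begin{cases} 1 & \text{if arc } (i,j) \text{ is destroyed} \\ 0 & \text{otherwise} \end{cases} \]

Min-Max optimization of flow [dual variables]

\[ \min_{Y \in \Upsilon} \; \max_{p} \; \sum_{s,t \in E} \left( Z_{s,t} - \sum_{(i,j) \in A} v_{i,j}\, X_{i,j}^{s,t}\, Y_{i,j} \right) \tag{C4.0} \]

s.t.

\[ \sum_{(k,j) \in A} X_{k,j}^{s,t} - \sum_{(i,k) \in A} X_{i,k}^{s,t} =
   \begin{cases} Z_{s,t} & \text{if } k = s \\ 0 & \text{if } k \ne s, t \\ -Z_{s,t} & \text{if } k = t \end{cases}
   \quad \forall k \in N,\ \forall (s,t) \in E \times E \quad [\alpha_k^{s,t}] \tag{C4.1} \]

\[ \sum_{s,t \in E} X_{i,j}^{s,t} \le u_{i,j} \quad \forall (i,j) \in A \quad [\beta_{i,j}] \tag{C4.2} \]

\[ Z_{s,t} - p D_s D_t = 0 \quad \forall (s,t) \in E \times E \quad [\mu^{s,t}] \tag{C4.3} \]

\[ X_{i,j}^{s,t} \ge 0, \quad Z_{s,t} \ge 0, \quad p \ \text{U.R.S.} \]

where $Y \in \Upsilon$ is the set of attack plans satisfying

\[ \sum_{(i,j) \in A} Y_{i,j} + \sum_{k \in N} Y_k \le \text{attacks} \tag{C4.4} \]

\[ Y_{i,j} = Y_{j,i} \tag{C4.5} \]

\[ Y_{i,j},\, Y_k \in \{0, 1\} \quad \forall (i,j) \in A \]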
Taking the dual of the inner maximization in the min-max formulation yields
the following formulation, which is a single minimization problem.

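Formulation 5: MAX MP Dual (Minimizing the Maximizing Multiple-Path Flow)

Min-Max optimization of flow

$$\min_{\alpha,\,\beta,\,\mu,\,Y} \; \sum_{(i,j) \in A} u_{i,j}\, \beta_{i,j} \tag{C5.0}$$

s.t.

$$\alpha^{s,t}_i - \alpha^{s,t}_j + \beta_{i,j} + v_{i,j} Y_{i,j} \ge 0 \qquad \forall (i,j) \in A,\ \forall s,t \in N \qquad [X^{s,t}_{i,j}] \tag{C5.1}$$

$$\alpha^{s,t}_t - \alpha^{s,t}_s + \mu^{s,t} \ge 1 \qquad \forall s,t \in N \qquad [Z^{s,t}] \tag{C5.2}$$

$$\sum_{s,t \in E} D_s D_t\, \mu^{s,t} = 0 \qquad [p] \tag{C5.3}$$

$$\alpha^{s,t}_k \ \text{U.R.S.}, \quad \mu^{s,t} \ \text{U.R.S.} \qquad \forall (s,t) \in E \times E$$

$$\beta_{i,j} \ge 0 \qquad \forall (i,j) \in A$$

$$\sum_{(i,j) \in A} Y_{i,j} \le attacks \tag{C5.4}$$

$$Y_{i,j} \in \{0, 1\} \qquad \forall (i,j) \in A$$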

Discussion 

This formulation is the dual of the maximizing multiple-path flow
formulation. The formulation consists of dual variables for flow balance: equation
(C5.1), arc capacity: equation (C5.2), and demand: equation (C5.3) for each s-t
pair. Like the single-path model, the attacker desires to minimize the previously
maximized sum of traffic routed along all return arcs (t, s) in the network.
The following table shows the number of decision variables and
constraints contained in the multiple-path MIP model (dual):

Dual Decision Variables              Dual Constraints
Node Flow         n·e·(e-1)          Flow Balance      m·e·(e-1)
Upper Bound       m                  Arc Capacity      e·(e-1)
Commodity Flow    e·(e-1)            Demand            e·(e-1)


The  multiple-path  dual  model  is  a  mixed  integer  program  that  we  also 
solve  using  GAMS  and  the  CPLEX  Solver.  Again,  the  time  GAMS  requires  to 
solve  this  problem  grows  significantly  with  the  number  of  s-t  pairs  in  the  network. 

Working  with  Proportional  Flow 

If a user is disconnected from the network as a result of an attack, flow to
or from that user is no longer possible. Thus, that user cannot send or receive
traffic, so $Z^{s,t} = 0$. Just as in the single-path formulation, flows are constrained to
be proportional to each other ($Z^{s,t} - p D_s D_t = 0$), so the disconnection of a single
edge node from the network effectively sets $p = 0$ and all flows disappear. In
practice, the disconnection of a user does not preclude other users from sending
and/or receiving traffic. In this sense, the proportionality constraint used here is
unrealistic. In order to facilitate the computation of a more reasonable traffic
response to an attacked network, we consider the following model re-formulation:

Let $p^{s,t}$ be the proportionality constant for a single s-t pair. We modify
equation (C3.4):

$$Z^{s,t} - p^{s,t} D_s D_t = 0 \qquad \forall (s,t) \in E \times E$$

and add an additional constraint:

$$p^{s,t} = p\, R^{s,t} \qquad \forall (s,t) \in E \times E$$

where $R^{s,t} = 1$ if there exists a path connecting node s to node t, and $R^{s,t} = 0$ if no
such path exists. Thus $R^{s,t}$ is a binary value that indicates whether an individual
s-t path is available in the network.

There are two approaches for determining the $R^{s,t}$ values. The first is to
let them be binary variables and have the model determine the best choices.
(This makes the primal problem a MIP.) A drawback with this approach is that
while the model will never allow $R^{s,t} = 1$ if s and t are not connected, s-t pairs that
are connected might also be shut off proactively in order to provide a better
solution for maximizing flow through the remaining network (again the Braess
Paradox). Such a solution is contrary to the “fairness” assumption underlying our
use of proportional flows.

An alternative approach is to pre-compute the $R^{s,t}$ values using a
reachability algorithm (Appendix E). The multiple-path model (Figure 3) remains
the same, with the addition of the following data.

Formulation 6: R-MAX MP (Revised Maximizing Multiple-Path Flow)

Calculated Data

$R^{s,t}$    Connection between node s and node t [binary]

$$R^{s,t} = \begin{cases} 1 & \text{if node } t \text{ is "reachable" from node } s \\ 0 & \text{if node } t \text{ is not "reachable" from node } s \end{cases}$$
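
One way to pre-compute the reachability indicators is a breadth-first search from each edge node on the surviving network. The sketch below is a minimal illustration only; it is not the algorithm of Appendix E, which is not reproduced in this section, and the function names and the toy network are hypothetical.

```python
from collections import deque


def reachable_from(adj, source):
    """Return the set of nodes reachable from source by breadth-first search."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def reachability(adj, edge_nodes, destroyed=frozenset()):
    """Compute R[s, t] in {0, 1} for every ordered pair of distinct edge nodes.

    adj maps each node to its out-neighbours; destroyed holds removed nodes.
    """
    # Drop destroyed nodes and every arc that touches them.
    live = {u: [v for v in vs if v not in destroyed]
            for u, vs in adj.items() if u not in destroyed}
    R = {}
    for s in edge_nodes:
        hits = reachable_from(live, s) if s in live else set()
        for t in edge_nodes:
            if s != t:
                R[s, t] = 1 if t in hits else 0
    return R


# Hypothetical toy network: customers a, b, c attached to routers r1 and r2.
adj = {
    "a": ["r1"], "b": ["r2"], "c": ["r2"],
    "r1": ["a", "r2"], "r2": ["b", "c", "r1"],
}
# Losing r1 isolates customer a, while b and c can still reach each other.
R = reachability(adj, ["a", "b", "c"], destroyed={"r1"})
```

In this toy case only the pairs between b and c keep an indicator of 1, so only those pairs would retain a nonzero proportionality constant after the attack.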
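Formulation

$$\max \sum_{s,t \in E} Z^{s,t} \;=\; \max_{p} \; p \sum_{s,t \in E} R^{s,t} D_s D_t \tag{C6.0}$$

s.t.

$$\sum_{(k,j) \in A} X^{s,t}_{k,j} - \sum_{(i,k) \in A} X^{s,t}_{i,k} = \begin{cases} Z^{s,t} & \text{if } k = s \\ 0 & \text{if } k \neq s, t \\ -Z^{s,t} & \text{if } k = t \end{cases} \qquad \forall k \in N,\ \forall (s,t) \in E \times E \tag{C6.1}$$

$$\sum_{s,t \in E} X^{s,t}_{i,j} \le u_{i,j} \qquad \forall (i,j) \in A \tag{C6.2}$$

$$Z^{s,t} - p^{s,t} D_s D_t = 0 \qquad \forall (s,t) \in E \times E \tag{C6.3}$$

$$p^{s,t} = p\, R^{s,t} \qquad \forall (s,t) \in E \times E \tag{C6.4}$$

$$p \ \text{U.R.S.}$$

$$X^{s,t}_{i,j} \ge 0, \quad Z^{s,t} \ge 0, \quad R^{s,t} \in \{0, 1\} \qquad \forall (s,t) \in E \times E$$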


Discussion 

In summary, MAX MP calculates total flow through the network under
best-case traffic engineering. MAX MP Dual determines the optimal arc(s) to
attack in order to reduce flow through the network the most. Finally, R-MAX
MP calculates total flow on a “damaged” network. Just as for the single-path
model, this model allows us to study networks in which only the nodes are
capacitated ($u_{i,j} = \infty$), or when just the arcs are capacitated ($B_k = \infty$), or when
both nodes and arcs are capacitated.
III. ANALYZING THE ABILENE NETWORK


Abilene is the high-speed backbone of the Internet2 educational network,
a not-for-profit advanced networking consortium of universities, laboratories, and
government agencies. (Detailed information is available at
http://www.internet2.edu/.) Figure 1 represents Abilene’s network topology (as of
2004). We use the models developed in the previous chapter to examine how
different methods of traffic engineering affect the carrying capacity of the
network, as defined by its multi-commodity maximum flow.


[Figure legend: Abilene connection speeds of 155 Mbps, 622 Mbps, 2,488 Mbps, and 10,000 Mbps]

Figure 1. The Abilene Network
•  The clouds in the figure are customers, either campus networks
(white) or other network providers (grey).

•  Abilene has a total of 58 customers and/or peers. CENIC, ESnet,
GEANT, NYSERNet, and Oregon GigaPoP all use Abilene at more
than one location. We treat each of those connections as a separate
customer, bringing the total number to 65.

•  There are 4,160 customer-to-customer pairs (e(e-1) = 65 · 64 =
4,160).

•  Each of the eleven circles represents a transshipment node,
specifically a Juniper T640 Router, located in a major U.S. city.

•  The arcs are undirected, with line colors and thickness indicating
traffic capacity (i.e., bandwidth), which we use as a proxy for
customer demand for traffic (i.e., the demand for customer s to
route traffic to customer t is taken to be proportional to the product
of their bandwidth capacities).

•  There are fourteen two-way connections among the
transshipment routers.

As  detailed  in  the  previous  chapter,  we  seek  to  maximize  the  amount  of 
traffic  carried  among  the  4,160  customer-to-customer  pairs.  The  MAX  SP 
represents  the  simplest  form  of  traffic  engineering  in  which  data  is  routed 
between  customers  via  the  single  “shortest”  path  as  seen  by  IP.  The  MAX  MP 
represents  the  best-case  scenario  for  traffic  engineering,  in  which  data  sent  from 
customer  to  customer  can  be  split  into  multiple  streams,  each  following  its  own 
path.  Sometimes  the  optimal  multiple  paths  are  longer  than  the  shortest  path, 
sometimes  they  are  the  same.  Both  formulations  raise  all  4,160  flows  in  the 
network  in  proportion  to  one  another  (via  the  constant  of  proportionality  p)  until  at 
least  one  of  the  network  components  reaches  capacity  and  becomes  saturated. 
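
For fixed single-path routing, the proportional scaling described above reduces to a ratio test: every pair carries p times the product of its two demands, so p can rise until the most heavily loaded component reaches its capacity. The sketch below illustrates that calculation on a made-up three-customer network. The function name `max_proportional_flow` and all demands, paths, and capacities are illustrative assumptions; they are not the Abilene data or the EXCEL model used in this thesis.

```python
def max_proportional_flow(paths, demand, capacity):
    """Return (p, total_flow, utilization) for fixed single-path routing.

    paths[(s, t)] : list of capacitated components used by the s-t path
    demand[s]     : demand weight D_s of customer s
    capacity[k]   : capacity B_k of component k
    """
    # Load placed on each component per unit of p.
    load = {k: 0.0 for k in capacity}
    for (s, t), path in paths.items():
        for k in path:
            load[k] += demand[s] * demand[t]
    # Every component must satisfy p * load[k] <= capacity[k].
    p = min(capacity[k] / load[k] for k in capacity if load[k] > 0)
    total = p * sum(demand[s] * demand[t] for (s, t) in paths)
    utilization = {k: p * load[k] / capacity[k] for k in capacity}
    return p, total, utilization


# Hypothetical example: customers a, b, c routed through routers r1 and r2.
demand = {"a": 10.0, "b": 20.0, "c": 5.0}
capacity = {"r1": 1000.0, "r2": 1000.0}
paths = {
    ("a", "b"): ["r1", "r2"], ("b", "a"): ["r2", "r1"],
    ("a", "c"): ["r1", "r2"], ("c", "a"): ["r2", "r1"],
    ("b", "c"): ["r2"],       ("c", "b"): ["r2"],
}
p, total, utilization = max_proportional_flow(paths, demand, capacity)
```

In this toy case router r2 reaches a utilization of 1 and is therefore the bottleneck in the sense used in this chapter, while r1 is left with spare capacity.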
A. OPTIMAL FLOWS WITH NODE CAPACITIES


In  practice,  both  nodes  and  arcs  are  capacitated,  but  here  we  will  focus  on 
the  throughput  capacity  of  transshipment  routers  in  the  network.  Here  the 
transshipment  routers  (the  nodes)  each  have  a  maximum  capacity  of  320,000 
megabits  per  second  (Mbps),  which  represents  the  highest  combination  of  line 
cards  supported  by  the  T640  Router  at  the  time  the  data  was  collected. 

1.  Maximum  Flow  through  Abilene 

The  total  amount  of  traffic  routed  between  customers  using  MAX  SP  and 
MAX  MP  is  630,941  Mbps  (C1.0)  and  738,442  Mbps  (C3.0)  respectively.  Those 
results,  along  with  the  transshipment  node  utilizations  are  displayed  in  the  table 
below. 


Utilization of an internal node (router) is simply the fraction of that
router’s maximum capacity used by the 4,160 customer-to-customer pairs (MAX
SP: total flow routed through router k divided by $B_k$; MAX MP:
$\sum_{s,t} X^{s,t}_{i,j} / u_{i,j}$).

NOTE:  The  units  on  the  “flow”  values  in  the  tables  and  figures  throughout  this 
chapter  are  in  Mbps. 

Table 1. Utilization of Abilene Transshipment Routers Under Maximum Flows

                 MAX SP      MAX MP      Increase
Total Flow       630,941     738,442     15%
p                0.000035    0.000041    15%
ATLANTA          0.742       1           26%
CHICAGO          0.652       0.940       29%
DENVER           0.578       0.887       31%
HOUSTON          0.608       0.963       36%
INDIANAPOLIS     0.528       0.927       40%
KANSAS CITY      0.595       1           41%
LOS ANGELES      0.439       0.469       3%
NEW YORK         0.901       0.939       4%
SEATTLE          0.541       0.633       9%
SUNNYVALE        0.335       0.395       6%
WASHINGTON DC    1           1           0%


The less restrictive MAX MP achieves a 15% increase of flow through the
network. Also, every transshipment router other than Washington D.C., which is
saturated in both models, is utilized more than it is in MAX SP, and the increases
vary by router.


In both models, the Washington D.C. router is the first to reach its capacity
(its utilization equals 1: flow through the router divided by $B_k$ in MAX SP, and
$\sum_{s,t} X^{s,t}_{i,j} / u_{i,j}$ in MAX MP) and thus is the “bottleneck.” It is
preventing a further increase in flow (p). In the multiple-path model, the Atlanta
and Kansas City routers are also saturated.


Increasing the throughput capacity of the bottleneck router(s) in the
network would enable an increase in flow through the network. For example, if
we could double the capacity of Washington D.C. ($B_{WashingtonDC}$ = 640,000 Mbps,
perhaps by operating two Juniper T640 routers in parallel), we would increase
flow 10% (p = 0.000039) for single-path routing. Doing the same to Sunnyvale
instead produces a 0% flow increase.

In practice, it may not be feasible, or necessary, to increase the capacity
of every transshipment router in order to improve total throughput, so
identifying the bottleneck(s) is important.


The  next  two  figures  demonstrate  the  actual  data  flows  between  the 
transshipment  routers.  The  bold  italicized  number  adjacent  to  the  router  is  the 
sum  of  the  demands  of  the  customers  located  at  that  particular  router.  The 
number  in  each  node  is  its  utilization  (expressed  as  a  fraction  of  its  capacity) 
under  maximum  flow  conditions.  These  numbers  correspond  to  the  values  in 
Table  1.  Not  shown  is  the  data  flow  between  customers  who  use  Abilene  at  the 
same  transshipment  router. 
Figure 2. Abilene Single-Path Flow through Nodes

Figure 3. Abilene Multiple-Path Flow through Nodes

As  expected,  the  flow  levels  in  MAX  SP  (Figure  2)  are  symmetric  since  the 
shortest  paths  between  the  transshipment  routers  are  also  symmetric. 

In Figure 3, the flow levels on each arc are no longer symmetric since the
multiple-path model uses all available capacity, even if not on the shortest path.

The difference in the flow values between the two figures on the
transshipment connections can be explained by MAX MP’s ability to use longer
and/or multiple routes for sending traffic between customers.

The next two figures show the paths for data destined to the New York
(dashed green arrows) and Sunnyvale (solid blue arrows) routers from the other
routers in the network (that is, the MAX SP and MAX MP flows with destination
t = New York and t = Sunnyvale).
Figure 4. Single-Path Flow to New York & Sunnyvale Routers

Figure 5. Multiple-Path Flow to New York & Sunnyvale Routers
The total amounts of traffic traveling to New York and Sunnyvale are
shown in Table 2 (MAX SP: $\sum_s X^{s,t}$, MAX MP: $\sum_s Z^{s,t}$, where t = New York and
Sunnyvale). The numbers shown simply reflect the differences in p values
between the single-path and multiple-path solutions.

Table 2. Traffic Flow to New York & Sunnyvale Routers

                 Single-Path                    Multiple-Path
From             To NEW YORK    To SUNNYVALE    To NEW YORK    To SUNNYVALE
ATLANTA          7,357          2,271           8,610          2,658
CHICAGO          15,438         4,765           18,068         5,577
DENVER           2,171          670             2,541          784
HOUSTON          2,338          722             2,737          845
INDIANAPOLIS     4,850          1,497           5,676          1,752
KANSAS CITY      1,671          516             1,956          604
LOS ANGELES      13,598         4,197           15,915         4,912
NEW YORK         -              10,201          -              11,939
SEATTLE          21,677         6,691           25,370         7,831
SUNNYVALE        10,201         -               11,939         -
WASHINGTON DC    35,725         11,027          41,812         12,906

Abilene  data  reduction 

Both the single-path (EXCEL) and multiple-path (GAMS) models take a
considerable amount of time to run on the Abilene data. The 4,160
customer-to-customer paths translate into a large number of decision variables and
constraints, as shown in the table below.


Single-Path (Figure 1)                                        TOTAL
Decision Variables   Customer-to-Customer Pairs   4,160       Variables     4,160
Constraints          Router Capacity              11          Constraints   4,329
                     Arc Capacity                 158
                     Flow Proportionality         4,160

Multiple-Path (Figure 3)                                      TOTAL
Decision Variables   Flow on Return Arcs          4,160       Variables     4,329
                     Flow through Nodes           169
Constraints          Balance of Flow              361,920     Constraints   366,249
                     Arc Capacity                 169
                     Flow Proportionality         4,160
We can significantly reduce the number of customer-to-customer paths if
we only consider paths between the eleven transshipment routers. The demand
($D_k$) at each router can be aggregated as the sum of the demands of that
router’s customers (the same values shown in Figure 2 and Figure 3). However, this
reduction does not account for the traffic routed between customers who use the
same router, so we leave those paths in our data reduction (for example, we
keep the paths from University of Hawaii to Pacific Northwest GigaPoP and to
Pacific Wave, but drop its paths to the other sixty-two customers).

We reduce the number of customer-to-customer paths to 462 (110
router-to-router paths plus 352 total “local” customer paths).


Single-Path (Figure 1)                                        TOTAL
Decision Variables   Customer-to-Customer Pairs   462         Variables     462
Constraints          Router Capacity              11          Constraints   631
                     Arc Capacity                 158
                     Flow Proportionality         462

Multiple-Path (Figure 3)                                      TOTAL
Decision Variables   Flow on Return Arcs          462         Variables     631
                     Flow through Nodes           169
Constraints          Balance of Flow              40,194      Constraints   40,825
                     Arc Capacity                 169
                     Flow Proportionality         462

This  reduction  dramatically  improves  the  model  run  times,  from  minutes  to 
seconds. 
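
The totals in the two tables above can be cross-checked with a few lines of arithmetic. The sketch below rests on an assumption that this section does not state explicitly: each of the 11 routers is modeled as two nodes joined by one internal capacitated arc, which gives 65 + 2·11 = 87 nodes and 2·14 + 2·65 + 11 = 169 arcs. Under that assumption the counts match the table entries. The function name `model_sizes` is hypothetical.

```python
def model_sizes(pairs, customers=65, routers=11, router_links=14):
    """Reproduce the variable and constraint counts for a number of s-t pairs."""
    # Assumption: each router is split into an "in" node and an "out" node
    # joined by one internal arc that carries the router capacity.
    nodes = customers + 2 * routers                  # 87
    link_arcs = 2 * router_links + 2 * customers     # directed arcs, 158
    arcs = link_arcs + routers                       # plus internal arcs, 169
    single = {
        "variables": pairs,
        "constraints": routers + link_arcs + pairs,
    }
    multiple = {
        "variables": pairs + arcs,
        "constraints": nodes * pairs + arcs + pairs,
    }
    return single, multiple


# Full problem: all 65 * 64 customer-to-customer pairs.
full_single, full_multiple = model_sizes(65 * 64)
# Reduced problem: 462 pairs after aggregating demand at the routers.
reduced_single, reduced_multiple = model_sizes(462)
```

The flow-balance block (one constraint per node per pair) dominates the multiple-path model, which is why cutting the number of pairs from 4,160 to 462 shrinks it by roughly a factor of nine.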

2.  Single  Node  Attack 

Now we consider the impact of losing one of the transshipment routers.
Causes for losing a router range from equipment failure to a deliberate attack.
When a router is lost, its throughput capacity goes to zero ($B_k = 0$), making it
unavailable to the network. Thus customers connected to that router are no
longer able to send traffic to or receive traffic from the other customers.
The top five “optimal” router attacks for single-path routing, obtained by
enumerating every single-router attack, are calculated using MIN-MAX SP and
appear in the table below. Also shown are the percentage changes to the total
flow between customer-to-customer pairs (C2.0) and to the flow through the
network (p) after a particular router attack ($Y_k = 1$).

Recall from Chapter II that after an attack has occurred, the amount of flow
through the network (p) adjusts to accommodate the “new” capacity and demand
constraints.

Table 3. Top 5 Single-Node Attacks Under Single-Path Routing

Rank   Router           Total Flow   Change   p          Change
1      INDIANAPOLIS     431,804      -32%     0.000026   -26%
2      CHICAGO          438,168      -31%     0.000030   -14%
3      ATLANTA          458,729      -27%     0.000028   -20%
4      KANSAS CITY      512,749      -19%     0.000029   -17%
5      WASHINGTON DC    516,005      -18%     0.000051   31%

The  top  five  optimal  router  attacks  to  multiple-path  routing  are  obtained  by 
solving  MAX  MP  DUAL.  The  values  from  equation  (C5.0),  the  minimized 
maximum  flow,  are  shown  in  the  next  table. 

Table 4. Minimized Total Flow

Rank   Router           Minimized Total Flow
1      WASHINGTON DC    98,442
2      NEW YORK         142,047
3      CHICAGO          300,106
4      SEATTLE          333,428
5      INDIANAPOLIS     391,222

We compute attacks 2 through 5 by making the previously attacked router(s)
invulnerable (for example, setting $v_{WashingtonDC,\,WashingtonDC'} = 0$ allows us to
determine the second-best router attack plan).
After a particular router attack ($Y_{k,k'} = 1$), the flow through the network
adjusts to accommodate the new capacity and demand constraints. We use
R-MAX MP to compute the total flow between customer-to-customer pairs (C6.0)
and the flow through the network (p). Those results are shown in the next table.

Table 5. Top 5 Single-Node Attacks Under Multiple-Path Routing

Rank   Router           Total Flow   Change   p          Change
1      INDIANAPOLIS     431,804      -42%     0.000026   -37%
2      CHICAGO          438,168      -41%     0.000030   -27%
3      ATLANTA          458,729      -38%     0.000028   -32%
4      WASHINGTON DC    516,005      -30%     0.000051   19%
5      NEW YORK         588,184      -20%     0.000055   25%

After flow through the network is adjusted, Indianapolis becomes the
worst router to lose. Notice that, in both models, the changes in total flow differ
from the changes in flow through the network, as measured by p.

Attacks on Indianapolis, Chicago, and Atlanta are the most devastating.
The loss of any of those routers reduces (but does not eliminate; see Figure 13) the
network’s “path diversity” such that R-MAX MP now uses only single paths
when routing traffic.

Notice in the tables above that after a loss of the Washington D.C. router
in MIN-MAX SP, and a loss of the Washington D.C. or New York router in R-MAX
MP, flow through the remaining network (p) actually increases. This is again an
example of Braess’s paradox (discussed in Chapter II). By removing the large
demand associated with Washington D.C. customers ($D_{WashingtonDC} = 0$ instead of
33,217 Mbps), it becomes possible to raise flow in the network from
0.000035 (single-path) and 0.000041 (multiple-path) to 0.000051.
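
A rough check of why p can rise while total flow falls: every served pair carries flow equal to p times the product of its two demands, so total flow is p times the sum of those products over the pairs still served. The figures below use the rounded p values reported in the tables for single-path routing, so the ratio is only approximate.

```latex
\[
\text{Total flow} = p \sum_{\text{served } (s,t)} D_s D_t
\qquad\Longrightarrow\qquad
\frac{\sum D_s D_t \ \text{(after)}}{\sum D_s D_t \ \text{(before)}}
= \frac{516{,}005 \,/\, 0.000051}{630{,}941 \,/\, 0.000035} \approx 0.56
\]
```

That is, losing the Washington D.C. router removes roughly 44% of the demand products being served, which more than offsets the rise in p.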

The next table shows the benefits of R-MAX MP over MIN-MAX SP in
terms of total flow in the presence of a router attack.
Table 6. Single-Path vs. Multiple-Path Flow Following a Router Attack

                     Single-Path   Multiple-Path   Multiple-Path Increase
Pre-Attack Flows:    630,941       738,442         15%

Post-Attack Flows
Router Loss:
ATLANTA              458,729       458,729         0%
CHICAGO              438,168       438,168         0%
DENVER               516,354       642,818         20%
HOUSTON              606,461       619,689         2%
INDIANAPOLIS         431,804       431,804         0%
KANSAS CITY          512,749       592,952         14%
LOS ANGELES          629,785       679,827         7%
NEW YORK             536,860       588,184         9%
SEATTLE              536,483       649,799         17%
SUNNYVALE            583,707       691,886         16%
WASHINGTON DC        516,005       516,005         0%

From  the  previous  example,  the  next  two  figures  show  how  data  flowing  to 
New  York  and  Sunnyvale  is  re-routed  following  the  Indianapolis  attack. 


Figure 6. Single-Path Flow to New York & Sunnyvale after Indianapolis Attack
(panels: BEFORE, AFTER)

Figure 7. Multiple-Path Flow to New York & Sunnyvale after Indianapolis Attack
(panels: BEFORE, AFTER)

We observe that R-MAX MP still uses multiple routes when sending traffic
to Sunnyvale. However, the total flow ((C2.0) = (C6.0) = 431,804 Mbps) is
the same in both models.

The total amount of traffic traveling to New York and Sunnyvale following
the Indianapolis attack is also the same in both models, as shown in the table
below. Comparing these values to the values in Table 2, we observe a 27% flow
decrease in MIN-MAX SP and a 37% decrease in R-MAX MP.




Table 7. Traffic Flow to New York & Sunnyvale Routers after Indianapolis Attack

Single-Path & Multiple-Path

From             To NEW YORK    To SUNNYVALE
ATLANTA          5,398          1,666
CHICAGO          11,328         3,497
DENVER           1,593          492
HOUSTON          1,716          530
INDIANAPOLIS     -              -
KANSAS CITY      1,226          379
LOS ANGELES      9,978          3,080
NEW YORK         -              7,486
SEATTLE          15,906         4,910
SUNNYVALE        7,486          -
WASHINGTON DC    26,215         8,092

The  Washington  D.C.  transshipment  router  is  the  bottleneck  in  both 
models. 

3.  Multiple  Node  Attacks 

Here we extend the previous analysis to the case where the number of
router attacks is greater than one (attacks > 1).

According to MAX MP DUAL, any node attack that splits the network into
more than one piece produces an objective value of zero (i.e.,
(C5.0) $= \sum_{(i,j) \in A} u_{i,j} \beta_{i,j} = 0$). Thus, a solution to this formulation is
uninformative because we are unable to observe flow through the remaining
network (p). As a result, we compute attacks on the multiple-path network using
R-MAX MP, and we determine the optimal attacks by total enumeration (just as
we do for MIN-MAX SP).
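
Total enumeration of attacks can be sketched as follows. This is a simplified illustration and not the GAMS or EXCEL implementation used in this thesis: it routes each pair on a single fewest-hop path found by breadth-first search (standing in for the IP shortest path), treats every node as both a router and a demand point, and uses a three-node toy network. All names and data are hypothetical.

```python
from collections import deque
from itertools import combinations


def shortest_path(adj, s, t, removed):
    """Breadth-first search for a fewest-hop s-t path avoiding removed nodes."""
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in adj[u]:
            if v not in removed and v not in parent:
                parent[v] = u
                queue.append(v)
    return None  # t is not reachable from s


def total_flow(adj, demand, capacity, removed):
    """Total proportional flow on the surviving network (0 if nothing routes)."""
    load = {k: 0.0 for k in adj if k not in removed}
    served = 0.0
    for s in load:
        for t in load:
            if s == t:
                continue
            path = shortest_path(adj, s, t, removed)
            if path is None:
                continue  # disconnected pair: its reachability indicator is 0
            served += demand[s] * demand[t]
            for k in path:
                load[k] += demand[s] * demand[t]
    if served == 0.0:
        return 0.0
    # Raise p until the most heavily loaded surviving node saturates.
    p = min(capacity[k] / load[k] for k in load if load[k] > 0)
    return p * served


def worst_attack(adj, demand, capacity, attacks):
    """Enumerate every set of `attacks` nodes; return (flow, attacked nodes)."""
    return min((total_flow(adj, demand, capacity, set(combo)), combo)
               for combo in combinations(sorted(adj), attacks))


# Hypothetical line network A - B - C with unit demands and equal capacities.
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
demand = {k: 1.0 for k in adj}
capacity = {k: 6.0 for k in adj}
flow, routers = worst_attack(adj, demand, capacity, attacks=1)
```

In this toy network the worst single attack is the middle node B, which disconnects the two remaining nodes. Removing an end node instead leaves total flow unchanged, because the surviving pair can be scaled up further once the lost demand no longer loads B.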

An inspection of Figure 1 might lead one to suspect that the optimal
two-router attack would split the network in half (e.g., Atlanta and Indianapolis,
or Kansas City and Houston), or that the optimal two attacks would include
Indianapolis, the optimal one-router attack. However, the optimal two-router
attack (out of all two-router combinations) for both models is Chicago and Seattle.

The resulting network from the Chicago and Seattle attack consists of nine
routers and ten arcs. Thus, the only routes that exist between the transshipment
routers are the single shortest paths.

The optimal three-router attack (out of all three-router combinations) for both
models is Chicago, Seattle, and Los Angeles.

The next table shows the total amount of traffic routed between
customers, (C2.0) and (C6.0), the flow through the network (p), and the
transshipment router utilizations (MAX SP: total flow routed through router k
divided by $B_k$; MAX MP: $\sum_{s,t} X^{s,t}_{i,j} / u_{i,j}$) in the event
of one, two, and three attacks.


Table 8. Optimal Router Attacks

                     Single-Path   Multiple-Path   Both Models
Number of Attacks    0             0               1          2          3
Total Flow           630,941       738,442         431,804    404,454    389,298
p                    0.000035      0.000041        0.000026   0.00004    0.000051

Router Utilization
ATLANTA              0.742         1               0.745      0.710      0.590
CHICAGO              0.652         0.940           0.282      0          0
DENVER               0.578         0.887           0.333      0.066      0.295
HOUSTON              0.608         0.963           0.676      0.534      0.354
INDIANAPOLIS         0.528         0.927           0          0.114      0.125
KANSAS CITY          0.595         1               0.344      0.903      0.326
LOS ANGELES          0.439         0.469           0.382      0.470      0
NEW YORK             0.901         0.939           0.765      0.651      0.294
SEATTLE              0.541         0.633           0.382      0          0
SUNNYVALE            0.335         0.395           0.239      0.239      0.254
WASHINGTON DC        1             1               1          1          1

1 router attack: There is a 32% decrease (i.e., from (C1.0) = 630,941 to
(C2.0) = 431,804 Mbps) and a 42% decrease in the total amount of traffic routed
between customers, and a 26% and a 37% decrease in flow through the network,
in the single-path and multiple-path models, respectively.

1 router attack vs. 2 router attack: In both models, the total amount of
traffic routed between customers decreases 6%, while flow through the network
increases 35% (Braess’s paradox).

2 router attack vs. 3 router attack: In both models, the total amount of
traffic routed between customers decreases 4%. Here flow through the network
increases 22% (again, Braess’s paradox).

The  Washington  D.C.  transshipment  router  is  the  bottleneck  in  both 
models  in  all  three  attacks.  Thus,  the  optimal  attacks  in  all  cases  do  not  include 
the  bottleneck.  Rather,  the  attacks  seem  to  redirect  flow  toward  the  bottleneck. 
The  bottlenecks  restrict  flow  through  the  network  and  thus  the  attacker  does  not 
want  to  eliminate  that  restriction. 

B.  FLOW  ON  CAPACITATED  ARCS 

In this section of the analysis, we remove the capacity constraint on the
transshipment routers ($B_k = \infty$). Now only the fourteen router-to-router
connections are capacitated. In reality, the connections are single “duplex”
connections, meaning that they support traffic flowing in both directions. Here,
we treat each connection as a pair of directed arcs (e.g., Atlanta-to-Houston and
Houston-to-Atlanta are different arcs), each with a speed of 10 gigabits per second
(Gbps) ($u_{i,j}$ = 10,000 Mbps). The arcs connecting customers to their
transshipment router remain uncapacitated.
1. Maximum Flow through Abilene

The tables below show the utilization of the twenty-eight transshipment node
connections (the total flow on arc $(i, j)$ divided by $u_{i,j}$) for the single-path
and multiple-path models, as well as the total amount of flow between
customer-to-customer pairs, (C2.0) and (C6.0), and flow through the network (p).

Table 9. Utilization of Abilene Arcs Under Maximum Flows

                 Single-Path   Multiple-Path   Multiple-Path Increase
Total Flow       67,802        76,467          11%
p                0.0000038     0.0000042       10%
ATL-HOUSTON      1             1               0%
ATL-INDY         0.210         0.805           74%
ATL-DC           0.992         1               1%
CHICAGO-INDY     0.757         0.972           22%
CHICAGO-NY       0.781         1               22%
DNVR-KC          0.961         1               4%
DNVR-SEATTLE     0.723         1               28%
DNVR-SUNNY       0.199         0.766           74%
HOUSTON-ATL      1             0.972           -3%
HOUSTON-KC       0.384         0.962           60%
HOUSTON-LA       0.595         0.987           40%
INDY-ATL         0.210         0.834           75%
INDY-CHICAGO     0.757         0.972           22%
INDY-KC          0.620         0.827           25%
KC-DNVR          0.961         1               4%
KC-HOUSTON       0.384         0.934           59%
KC-INDY          0.620         0.855           27%
LA-HOUSTON       0.595         0.987           40%
LA-SUNNY         0.301         0.461           35%
NY-CHICAGO       0.781         1               22%
NY-DC            0.814         0.862           6%
SEATTLE-DNVR     0.723         0.924           22%
SEATTLE-SUNNY    0.168         0.081           -52%
SUNNY-DNVR       0.199         0.843           76%
SUNNY-LA         0.301         0.461           35%
SUNNY-SEATTLE    0.168         0.005           -97%
DC-ATL           0.992         1               1%
DC-NY            0.814         0.862           6%
The next table compares flow through the capacitated-arc network to flow
through the capacitated-router network.

Table 10. Arc Capacity vs. Router Capacity

                     Capacitated Arcs   Capacitated Routers   Decrease
Single-Path p        0.0000038          0.000035              89%
Multiple-Path p      0.0000042          0.000041              90%

The difference in network flow between the capacitated-arc and
capacitated-router networks is at least 89% for both MAX SP and MAX MP.
Thus arcs are the “severe” constraints on these max flow problems.

Flow levels for the arcs for MAX SP are symmetric, as expected.

There is an 11% increase in the total amount of flow between
customer-to-customer pairs for MAX MP, and a 10% increase in flow through the
network, again illustrating the limitations of single-path traffic routing.

There are only three out of twenty-eight instances where an arc is utilized
more in MAX SP than it is in MAX MP: Houston-Atlanta, Seattle-Sunnyvale,
and Sunnyvale-Seattle.

The bottlenecks in MAX SP are the Atlanta-Houston and Houston-Atlanta
arcs.

There are eight bottlenecks in MAX MP. They are the Atlanta-Houston,
Atlanta-Washington D.C., Chicago-New York, Denver-Kansas City, Denver-Seattle,
Kansas City-Denver, New York-Chicago, and Washington D.C.-Atlanta arcs.
2. Single Arc Attack

We examine the impact that losing one of the arcs ($Y_{i,j} = 1$) has on the
total amount of flow between customer-to-customer pairs in both models.
Remember, destroying arc $(i, j)$ also destroys arc $(j, i)$ ($Y_{i,j} = Y_{j,i}$).

Table 11. Single-Path vs. Multiple-Path Flow Following an Arc Attack

                   Single-Path              Multiple-Path
Connection Lost    Total Flow    Net Chg    Total Flow    Net Chg
ATL-HOUSTON        41,866        -38%       41,866        -45%
ATL-INDY           69,701        3%         76,142        -0.4%
ATL-DC             38,234        -44%       38,234        -50%
CHICAGO-INDY       38,773        -43%       38,773        -49%
CHICAGO-NY         38,234        -44%       38,234        -50%
DNVR-KC            42,629        -37%       43,577        -43%
DNVR-SEATTLE       67,802        0%         76,104        -0.5%
DNVR-SUNNY         57,610        -15%       76,467        0%
HOUSTON-KC         68,338        1%         76,467        0%
HOUSTON-LA         43,577        -36%       43,577        -43%
INDY-KC            41,866        -38%       41,866        -45%
LA-SUNNY           53,707        -21%       53,707        -30%
NY-DC              43,183        -36%       43,183        -44%
SEATTLE-SUNNY      67,802        0%         76,104        -0.5%

The  optimal  attack  plan  for  both  models  is  to  attack  either  the  arcs 
between  Atlanta  and  Washington  D.C.,  or  between  Chicago  and  New  York. 

For MIN-MAX SP, eliminating Atlanta-Indianapolis or Houston-Kansas City slightly increases total flow through the network, by 3% and 1% respectively (Braess’s paradox).

For R-MAX MP, losing the Atlanta-Indianapolis arcs (decrease of about 0.4%), Denver-Seattle arcs (decrease < 0.5%), Denver-Sunnyvale arcs, Houston-Kansas City arcs, or Seattle-Sunnyvale arcs (decrease < 0.5%) does not have a noticeable impact on total flow through the network. Thus, R-MAX MP is more robust to attacks.
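The Net Chg columns of Table 11 can be reproduced with simple arithmetic. The sketch below assumes the pre-attack totals are 67,802 (single-path) and 76,467 (multiple-path); these baselines are inferred from the rows that report a 0% change, not stated in this section, and the function and variable names are illustrative only.

```python
# Assumed baselines, inferred from the 0% rows of Table 11.
BASE_SP = 67_802
BASE_MP = 76_467

def net_change(flow_after, baseline):
    """Percent change in total flow relative to the pre-attack baseline."""
    return 100.0 * (flow_after - baseline) / baseline

# A few rows of Table 11: connection lost -> (single-path, multiple-path) total flow
rows = {
    "ATL-DC": (38_234, 38_234),
    "ATL-INDY": (69_701, 76_142),
    "LA-SUNNY": (53_707, 53_707),
}

for name, (sp, mp) in rows.items():
    print(f"{name}: SP {net_change(sp, BASE_SP):+.1f}%, MP {net_change(mp, BASE_MP):+.1f}%")
```

Under these assumed baselines, rows where both models end at the same post-attack flow (for example ATL-DC) show a larger percentage drop for the multiple-path model only because its baseline is higher.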
C. FLOW ON CAPACITATED NODES AND ARCS

Now  we  look  at  Abilene  when  both  the  transshipment  routers  and  their 
twenty-eight  arcs  are  capacitated. 

The routers again have a throughput capacity of 320,000 Mbps. Arc speeds of 10 Gbps produce the same results listed in Tables 8 and 9, which implies that the arc capacities are the “real” constraints in the network. Among the eleven transshipment routers, Washington D.C. is utilized the most in both the single-path and multiple-path models, with utilization levels of 0.107 and 0.115, respectively. To determine the number of arcs in parallel (or bandwidth increase) required to saturate a transshipment router, we uniformly increase the arc capacities until one of the routers saturates.

The  results  are  displayed  in  the  next  two  figures.  The  connection  speeds 
are  in  Mbps. 


Figure  8.  Single-Path  Utilization  vs.  Connection  Speed 
Figure 9. Multiple-Path Utilization vs. Connection Speed


In  both  models,  we  can  increase  arc  capacity  to  100  Gbps  before  we 
saturate  a  transshipment  router.  Thus  the  connection  speed  required  to  saturate 
a  router  is  between  80  and  100  Gbps.  Washington  D.C.  is  the  router  that  is 
saturated  in  both  the  single-path  and  multiple-path  models. 
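As a rough cross-check of the 80 to 100 Gbps range, one can extrapolate from the Washington D.C. utilization levels reported at 10 Gbps. The sketch below assumes router load scales linearly with a uniform increase in arc capacity, which is plausible only while the arcs remain the binding constraints; it is an illustration, not the procedure used to produce Figures 8 and 9, and the names are hypothetical.

```python
BASE_SPEED_MBPS = 10_000  # 10 Gbps arcs, as in the text

def saturating_speed(utilization_at_base, base_speed=BASE_SPEED_MBPS):
    """Arc speed at which router utilization reaches 1.0, assuming linear scaling."""
    return base_speed / utilization_at_base

sp = saturating_speed(0.107)  # single-path utilization of Washington D.C.
mp = saturating_speed(0.115)  # multiple-path utilization of Washington D.C.
print(f"single-path: {sp:,.0f} Mbps, multiple-path: {mp:,.0f} Mbps")
```

Both estimates fall between 80,000 and 100,000 Mbps, which agrees with the range read from the figures.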

D.  CHAPTER  SUMMARY 

We applied the two models representing naive traffic engineering (single-path routing) and best-case traffic engineering (multiple-path routing) to analyze the maximum throughput of Abilene. We performed our analysis with node and/or arc capacities. We found that Abilene is well-provisioned in the sense that the arcs, in particular the customer connections, tend to be what limits data flow in the network, a generalization that is consistent with our results.

For both the single-path and multiple-path optimal solutions, the Washington D.C. transshipment router is the bottleneck. Increasing the line speeds of its connections would consequently increase both the total amount of flow between customer-to-customer pairs (summed over all s,t pairs in the single-path and multiple-path models) and the flow through the network (p).

Our  interdiction  analysis  shows  that  the  optimal  transshipment  router 
attack  is  to  remove  Indianapolis.  The  second  worst  single  router  attack  is 
Chicago,  which  is  involved  in  both  the  optimal  two-router  (Chicago,  Seattle)  and 
three-router  (Chicago,  Los  Angeles,  Seattle)  attacks.  The  optimal  single  arc 
attack  is  either  the  Atlanta-Washington  D.C.  or  Chicago-New  York  arc.  These 
results  suggest  the  importance  of  the  Indianapolis  and  Chicago  routers  to 
Abilene.  Perhaps  this  is  where  redundancy  should  be  built  into  the  network. 

We conclude that Abilene is “over provisioned” in terms of its routers and can handle increasing connection speeds (i.e., multiple connections in parallel). Line speeds of 40 Gbps (OC-768) could be implemented without needing to upgrade the routers.
IV. SUMMARY AND CONCLUSIONS


The  models  and  analysis  in  this  thesis  are  applicable  to  any  ISP  network. 
The  general  public,  businesses,  civilian  and  military  organizations  rely  heavily  on 
these  networks.  As  the  reliance  grows,  so  will  the  need  for  understanding  an 
ISP’s  limitations  and  vulnerability  to  attacks. 

The interdiction models used in this thesis, MIN-MAX SP and MAX MP Dual, identify the attack plan that reduces the maximum amount of traffic carried among users in the worst possible way. One could take our analysis a step further by conducting defender-attacker-defender analysis, where the defender (network operator) first decides which network components to protect, and studying how those decisions affect the attacker’s plan (Brown et al. 2006).

Future work will need to study alternatives to the gravity model because of the difficulties that arise when using it. For any multi-commodity gravity model network, flow through the network goes to zero if a single node cannot meet its demand. So interdicting a multi-commodity gravity model network is simple: just “disconnect” any node from the network. We avoid this in the single-path model by solving for the p for each node k algebraically where $B_k > 0$. Our revised multiple-path formulation (Figure 6) allows us to get around that for the multiple-path models. However, as demonstrated in Chapter III, there are often cases where Braess’s paradox occurs. It is unsettling that flow through the network could rise after an attack.
APPENDIX A: FLOYD-WARSHALL ALGORITHM


The Floyd-Warshall algorithm computes the shortest path between each pair of nodes in a network (R. Ahuja et al. 1993). The algorithm builds an $n \times n$ matrix of shortest path distances, $d(i, j)$, for each node pair, as well as an $n \times n$ matrix of predecessor nodes, pred$(i, j)$, for each node in a particular path. If no path exists, the distance is reported as $\infty$, and the predecessor is null. Floyd-Warshall runs in $O(n^3)$ operations.


algorithm Floyd-Warshall;
begin
    for all node pairs (i, j) ∈ N × N do
        d(i, j) := ∞ and pred(i, j) := 0;
    for all nodes i ∈ N do d(i, i) := 0;
    for each arc (i, j) ∈ A do d(i, j) := c_ij and pred(i, j) := i;
    for each k := 1 to n do
        for each (i, j) ∈ N × N do
            if d(i, j) > d(i, k) + d(k, j) then
            begin
                d(i, j) := d(i, k) + d(k, j);
                pred(i, j) := pred(k, j);
            end;
end;
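The pseudocode above translates almost line for line into a runnable form. The following is a minimal Python sketch, not the implementation used in this thesis; it assumes nodes are labelled 1 to n, stores both matrices as dictionaries keyed by (i, j), and uses None in place of the null predecessor 0. The four-node example network is hypothetical.

```python
import math

def floyd_warshall(n, arcs):
    """All-pairs shortest paths.

    n    -- number of nodes, labelled 1..n
    arcs -- dict mapping arc (i, j) to its cost c_ij
    Returns (d, pred), both dicts keyed by (i, j); pred is None if no path exists.
    """
    nodes = range(1, n + 1)
    d = {(i, j): math.inf for i in nodes for j in nodes}
    pred = {(i, j): None for i in nodes for j in nodes}
    for i in nodes:
        d[i, i] = 0
    for (i, j), cost in arcs.items():
        d[i, j] = cost
        pred[i, j] = i
    for k in nodes:
        for i in nodes:
            for j in nodes:
                # Route through k whenever that shortens the i-j distance.
                if d[i, j] > d[i, k] + d[k, j]:
                    d[i, j] = d[i, k] + d[k, j]
                    pred[i, j] = pred[k, j]
    return d, pred

# Hypothetical directed network: 1->2->3->4 plus a costly shortcut 1->3.
arcs = {(1, 2): 1, (2, 3): 1, (1, 3): 5, (3, 4): 2}
d, pred = floyd_warshall(4, arcs)
print(d[1, 4], pred[1, 4])
```

In this example the shortest path from node 1 to node 4 avoids the direct arc (1, 3), so the predecessor of node 3 on paths from node 1 is node 2.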
APPENDIX B: COMPUTING $r^{k}_{s,t}$


The following algorithm builds a matrix ($e(e-1)$ rows × $n$ columns) of binary $r^{k}_{s,t}$ values for use in the single-path formulations. It uses as input the pred$(i, j)$ matrix obtained from Floyd-Warshall. The algorithm runs in $e(e-1) \cdot n$ operations.


algorithm Compute r^k_{s,t};

begin
    for all (s, t) ∈ E × E do
        for all k ∈ N do
            r^k_{s,t} = 0;
end;

begin
    for s ∈ {1, 2, ..., E} do
        for t ∈ {1, 2, ..., E} do
            k = t;
            do {
                r^k_{s,t} = 1;
                k = pred(s, k);
            } while (k ≠ s)
end;
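The predecessor walk can be sketched in Python as follows. This is an illustration under stated assumptions: pred is a dictionary keyed by (s, k), pairs with s = t are skipped (consistent with the $e(e-1)$ row count), and a guard for unreachable pairs has been added that is not part of the algorithm above. As in the loop above, the source node s itself is not marked. All names and the example data are hypothetical.

```python
def compute_r(pred, edge_nodes, all_nodes):
    """Return r[s, t, k] = 1 if node k is visited when walking pred from t back to s."""
    r = {}
    for s in edge_nodes:
        for t in edge_nodes:
            if s == t:
                continue
            for k in all_nodes:
                r[s, t, k] = 0
            if pred.get((s, t)) is None:
                continue  # no path from s to t (guard added for this sketch)
            k = t
            while k != s:
                r[s, t, k] = 1
                k = pred[s, k]
    return r

# Hypothetical predecessor data for the directed path 1 -> 2 -> 3 -> 4.
pred = {(1, 2): 1, (1, 3): 2, (1, 4): 3}
r = compute_r(pred, edge_nodes=[1, 4], all_nodes=[1, 2, 3, 4])
print([k for k in [1, 2, 3, 4] if r[1, 4, k] == 1])
```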
APPENDIX C: COMPUTING $q^{s,t}_{i,j}$


The following algorithm builds a matrix ($e(e-1)$ rows × $m$ columns) of binary $q^{s,t}_{i,j}$ values for use in the single-path formulations. The algorithm uses as input the pred$(i, j)$ matrix obtained from Floyd-Warshall. The algorithm runs in $e(e-1) \cdot m$ operations.


algorithm Compute q^{s,t}_{i,j};

begin
    for all (s, t) ∈ E × E do
        for each arc (i, j) ∈ A do
            q^{s,t}_{i,j} = 0;
end;

begin
    k = t;
    do {
        if pred(s, k) = j then
            if pred(s, j) = i then
                q^{s,t}_{i,j} = 1;
        k = pred(s, i);
    } while (k ≠ s)
end;
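One way to obtain arc indicators from the predecessor matrix is sketched below in Python. It assumes that $q^{s,t}_{i,j} = 1$ exactly when arc (i, j) lies on the shortest path from s to t, which is our reading of the algorithm rather than something stated in this appendix, and it walks the path directly instead of testing each arc as the pseudocode does. The dictionary layout, the unreachable-pair guard, and the example data are all hypothetical.

```python
def compute_q(pred, edge_nodes, arcs):
    """Return q[s, t, (i, j)] = 1 if arc (i, j) is on the pred-path from s to t."""
    q = {}
    for s in edge_nodes:
        for t in edge_nodes:
            if s == t:
                continue
            for arc in arcs:
                q[s, t, arc] = 0
            if pred.get((s, t)) is None:
                continue  # no path from s to t (guard added for this sketch)
            k = t
            while k != s:
                i = pred[s, k]       # node visited just before k
                q[s, t, (i, k)] = 1  # so arc (i, k) is on the path
                k = i
    return q

# Hypothetical network: path 1 -> 2 -> 3 -> 4 plus an unused arc (1, 3).
arcs = [(1, 2), (2, 3), (1, 3), (3, 4)]
pred = {(1, 2): 1, (1, 3): 2, (1, 4): 3}
q = compute_q(pred, edge_nodes=[1, 4], arcs=arcs)
print([arc for arc in arcs if q[1, 4, arc] == 1])
```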
APPENDIX D: STANDARD LP MAX FLOW

The standard LP maximum flow formulation adds an unbounded return arc, arc (t, s), to a network and then maximizes flow on that arc. The rest of the formulation consists of balance of flow and arc capacity constraints.

Index Use

    $i, j \in N$        Nodes
    $(i, j) \in A$      Directed arc from node i to node j
    $s, t$              Source and terminal nodes

Data

    $u_{i,j}$           Upper bound on flow from node i to node j on arc $(i, j) \in A$ [flow]

Decision Variables

    $x_{i,j}$           Flow on directed arc $(i, j) \in A$ [flow]

Formulation

    $$\max \; x_{t,s}$$

    $$\text{s.t.} \quad \sum_{j} x_{i,j} - \sum_{j} x_{j,i} = 0 \qquad \forall\, i \in N$$

    $$0 \le x_{i,j} \le u_{i,j} \qquad \forall\, (i, j) \in A$$
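The optimal value of $x_{t,s}$ in this LP equals the maximum s-t flow, so a small instance can be cross-checked without an LP solver. The sketch below is not the LP itself; it swaps in the Edmonds-Karp augmenting-path algorithm, a standard combinatorial method that should return the same value. The function name, data layout, and example capacities are hypothetical, and s and t are assumed to be distinct.

```python
from collections import deque

def max_flow(capacity, s, t):
    """Edmonds-Karp: augment along shortest residual s-t paths until none remain.

    capacity -- dict mapping arc (i, j) to its upper bound u_ij
    Returns the maximum flow value, i.e. the LP's optimal x_ts.
    """
    residual = dict(capacity)
    for (i, j) in capacity:
        residual.setdefault((j, i), 0)  # reverse arcs allow flow to be cancelled
    neighbors = {}
    for (i, j) in residual:
        neighbors.setdefault(i, []).append(j)

    value = 0
    while True:
        # Breadth-first search for a path with positive residual capacity.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            i = queue.popleft()
            for j in neighbors.get(i, []):
                if j not in parent and residual[i, j] > 0:
                    parent[j] = i
                    queue.append(j)
        if t not in parent:
            return value
        # Recover the path and push its bottleneck capacity.
        path, j = [], t
        while parent[j] is not None:
            path.append((parent[j], j))
            j = parent[j]
        push = min(residual[arc] for arc in path)
        for (i, j) in path:
            residual[i, j] -= push
            residual[j, i] += push
        value += push

# Hypothetical network with source 1 and terminal 4.
capacity = {(1, 2): 3, (1, 3): 2, (2, 3): 1, (2, 4): 2, (3, 4): 3}
print(max_flow(capacity, 1, 4))
```

In this example the two arcs leaving node 1 have total capacity 5 and the computed flow uses all of it, so the cut around the source confirms the value.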
APPENDIX E: REACHABILITY FORMULATION / ALGORITHM


We determine whether source node s is able to reach terminal node t after an attack has occurred, for all s-t pairs in the network. Since we solve the multiple-path formulations in GAMS, we also use GAMS to determine node reachability by solving another LP formulation (as opposed to a standard reachability algorithm, which GAMS would execute very slowly).

Index Use

    $i, j, k \in N$             Nodes
    $(i, j) \in A$              Directed arc from node i to node j
    $s, t \in E \subseteq N$    Source and terminal nodes in the set of “edge” nodes E

Data

    $Y^{*}_{i,j}$               Binary indicator of whether the attacker destroyed arc $(i, j) \in A$:
                                1 if arc (i, j) was destroyed, 0 otherwise

Decision Variables

    $W_{s,t}$                   Flow from node s to node t [flow]
    $Q^{s}_{i,j}$               Flow from node s on arc $(i, j) \in A$ [flow]
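Formulation

    $$\max \; \sum_{s,t} W_{s,t}$$

    $$\text{s.t.} \quad \sum_{j:\,(i,j) \in A \,\mid\, Y^{*}_{i,j} = 0} Q^{s}_{i,j} \;-\; \sum_{j:\,(j,i) \in A \,\mid\, Y^{*}_{j,i} = 0} Q^{s}_{j,i} \;\le\; \begin{cases} \sum_{j} W_{s,j} & \text{if } s = i \\ -W_{s,i} & \text{if } s \ne i \end{cases} \qquad \forall\, s \in E,\; i \in N$$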

Discussion 

By maximizing $W_{s,t}$ for all s and t, we simultaneously determine whether we can send flow from s to t along the “surviving” arcs in the network ($Q^{s}_{i,j}$). The dual of the multiple-path maximum flow model (Figure 5) determines the $Y^{*}_{i,j}$ values. $Y^{*}_{i,j} = 1$ implies arc (i, j) is attacked and thus has zero capacity for flow, and $Y^{*}_{i,j} = 0$ implies arc (i, j) survived the attack.

Once we have established which nodes t are reachable from which nodes s (i.e., if a path from s to t exists), we are able to compute the $R_{s,t}$ values used in the revised multiple-path max flow model (Figure 6) with a simple algorithm.

algorithm Compute R_{s,t};

begin
    for all (s, t) ∈ E × E do
        if W*_{s,t} > 0 then
            R_{s,t} = 1;
        else R_{s,t} = 0;
end;
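For comparison, the standard reachability algorithm mentioned at the start of this appendix can be sketched outside GAMS as a breadth-first search over the surviving arcs. This is an alternative to the LP, not a restatement of it; it assumes $R_{s,t} = 1$ exactly when a directed path of surviving arcs leads from s to t. The function name, argument layout, and example network are hypothetical.

```python
from collections import deque

def compute_R(arcs, destroyed, edge_nodes):
    """Return R[s, t] = 1 if t is reachable from s using only surviving arcs.

    arcs       -- iterable of directed arcs (i, j)
    destroyed  -- set of arcs with Y*_ij = 1
    edge_nodes -- the set E of source and terminal nodes
    """
    successors = {}
    for (i, j) in arcs:
        if (i, j) not in destroyed:
            successors.setdefault(i, []).append(j)

    R = {}
    for s in edge_nodes:
        # Breadth-first search from s over surviving arcs.
        seen = {s}
        queue = deque([s])
        while queue:
            i = queue.popleft()
            for j in successors.get(i, []):
                if j not in seen:
                    seen.add(j)
                    queue.append(j)
        for t in edge_nodes:
            if t != s:
                R[s, t] = 1 if t in seen else 0
    return R

# Hypothetical line network 1 - 2 - 3 with arcs in both directions.
arcs = [(1, 2), (2, 1), (2, 3), (3, 2)]
print(compute_R(arcs, destroyed=set(), edge_nodes=[1, 3]))
print(compute_R(arcs, destroyed={(2, 3), (3, 2)}, edge_nodes=[1, 3]))
```

Destroying both directions of the 2-3 connection, as the arc attack model requires, separates the two edge nodes in this example.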